On Computing Condensed Frequent Pattern Bases
Abstract
Frequent pattern mining has been studied extensively. However, its effectiveness and efficiency are often limited because the number of frequent patterns generated is too large. In many applications it is sufficient to generate and examine only frequent patterns whose support is known to a close-enough approximation rather than in full precision. Such a compact but close-enough frequent pattern base is called a condensed frequent pattern base. In this paper, we propose and examine several alternatives for the design, representation, and implementation of such condensed frequent pattern bases. A few algorithms for computing such pattern bases are proposed. Their effectiveness at pattern compression and the efficiency of their computation are investigated. A systematic performance study on different kinds of databases demonstrates the effectiveness and efficiency of our approach to frequent pattern mining in large databases.
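To make the idea of "close-enough" support concrete, the following is a minimal sketch and not the paper's algorithm. It assumes one simple design: supports are split into bands of width k + 1, and for each band only the maximal patterns whose support reaches the band's lower bound are stored. Any frequent pattern's support can then be recovered to within an interval of width k. The function names (`frequent_itemsets`, `maximal`, `condensed_base`, `estimate_support`), the error bound `k`, and the toy transactions are all hypothetical and chosen for illustration only.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_sup):
    """Brute-force miner: map each itemset with support >= min_sup to its support."""
    items = sorted(set().union(*transactions))
    freq = {}
    for size in range(1, len(items) + 1):
        level = {}
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            support = sum(1 for t in transactions if candidate <= t)
            if support >= min_sup:
                level[candidate] = support
        if not level:
            break  # no frequent itemset of this size, so none of any larger size
        freq.update(level)
    return freq


def maximal(patterns):
    """Keep only the patterns that have no proper superset in the collection."""
    patterns = list(patterns)
    return [p for p in patterns if not any(p < q for q in patterns)]


def condensed_base(freq, min_sup, k):
    """Assumed design: one entry per support band of width k + 1.

    Each entry is (lo, maximal patterns with support >= lo).
    """
    base = []
    top = max(freq.values(), default=min_sup - 1)
    lo = min_sup
    while lo <= top:
        at_least_lo = [p for p, s in freq.items() if s >= lo]
        base.append((lo, maximal(at_least_lo)))
        lo += k + 1
    return base


def estimate_support(base, pattern, k):
    """Return an interval (lo, lo + k) containing the support, or None if infrequent."""
    for lo, maxima in reversed(base):  # highest band first
        if any(pattern <= m for m in maxima):
            return (lo, lo + k)
    return None


# Toy data (hypothetical): five transactions over items a, b, c.
transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"a"}, {"b", "c"}]
freq = frequent_itemsets(transactions, min_sup=1)
base = condensed_base(freq, min_sup=1, k=1)
```

On this toy data the base stores four patterns in place of the seven frequent itemsets, at the cost of knowing each support only up to an error of k. The brute-force miner is exponential in the number of items and is there only to keep the sketch self-contained.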
Similar resources
An Efficient One-pass Method for Discovering Bases of Recently Frequent Episodes over Online Data Streams
The knowledge embedded in an online data stream is likely to change over time due to the dynamic evolution of the stream. Consequently, in frequent episode mining over an online stream, frequent episodes should be adaptively extracted from recently generated stream segments instead of the whole stream. However, almost all existing frequent episode mining approaches find episodes frequently occu...
A new algorithm for computing SAGBI bases up to an arbitrary degree
We present a new algorithm for computing a SAGBI basis up to an arbitrary degree for a subalgebra generated by a set of homogeneous polynomials. Our idea is based on linear algebra methods, which keep the complexity and computational cost low. We then use it to solve the membership problem in subalgebras.
Separating Structure from Interestingness
Condensed representations of pattern collections have been recognized to be important building blocks of inductive databases, a promising theoretical framework for data mining, and recently they have been studied actively. However, there has not been much research on how condensed representations should actually be represented. In this paper we propose a general approach to build condensed repr...
Efficient Frequent Pattern Mining Based on a Condensed Tree Structure
In this paper, we present an efficient tree structure and its associated algorithm for discovering frequent patterns in a large data set. We demonstrate the effectiveness of our algorithm and its performance improvement over the existing approach CATS, which is one of the fastest frequent pattern mining algorithms known to date.
Représentation condensée en présence de valeurs manquantes
Missing values are an old problem that is very common in real databases. We describe the damage caused by missing values to condensed representations of patterns extracted from large databases. This is important because condensed representations are very useful to increase the efficiency of the extraction and enable new uses of patterns (e.g., rules with minimal body, clustering, classificat...
Publication date: 2002